Deep Analysis of Binary Code to Recover Program Structure

نویسنده

  • Khaled ElWazeer
چکیده

Title of dissertation: DEEP ANALYSIS OF BINARY CODE TO RECOVER PROGRAM STRUCTURE Khaled ElWazeer, Doctor of Philosophy, 2014 Dissertation directed by: Professor Rajeev Barua Department of Electrical and Computer Engineering Reverse engineering binary executable code is gaining more interest in the research community. Agencies as diverse as anti-virus companies, security consultants, code forensics consultants, law-enforcement agencies and national security agencies routinely try to understand binary code. Engineers also often need to debug, optimize or instrument binary code during the software development process. In this dissertation, we present novel techniques to extend the capabilities of existing binary analysis and rewriting tools to be more scalable, handling a larger set of stripped binaries with better and more understandable outputs as well as ensuring correct recovered intermediate representation (IR) from binaries such that any modified or rewritten binaries compiled from this representation work correctly. In the first part of the dissertation, we present techniques to recover accurate function boundaries from stripped executables. Our techniques as opposed to current techniques ensure complete live executable code coverage, high quality recovered code, and functional behavior for most application binaries. We use static and dynamic based techniques to remove as much spurious code as possible in a safe manner that does not hurt code coverage or IR correctness. Next, we present static techniques to recover correct prototypes for the recovered functions. The recovered prototypes include the complete set of all arguments and returns. Our techniques ensure correct behavior of rewritten binaries for both internal and external functions. Finally, we present scalable and precise techniques to recover local variables for every function obtained as well as global and heap variables. Different techniques are represented for floating point stack allocated variables and memory allocated variables. Data type recovery techniques are presented to declare meaningful data types for the detected variables. Our data type recovery techniques can recover integer, pointer, structural and recursive data types. We discuss the correctness of the recovered representation. The evaluation of all the methods proposed is conducted on SecondWrite, a binary rewriting framework developed by our research group. An important metric in the evaluation is to be able to recompile the IR with the recovered information and run it producing the same answer that is produced when running the original executable. Another metric is the analysis time. Some other metrics are proposed to measure the quality of the IR with respect to the IR with source code information available. DEEP ANALYSIS OF BINARY CODE TO RECOVER PROGRAM STRUCTURE

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Reverse Engineering of Network Software Binary Codes for Identification of Syntax and Semantics of Protocol Messages

Reverse engineering of network applications especially from the security point of view is of high importance and interest. Many network applications use proprietary protocols which specifications are not publicly available. Reverse engineering of such applications could provide us with vital information to understand their embedded unknown protocols. This could facilitate many tasks including d...

متن کامل

TIE: Principled Reverse Engineering of Types in Binary Programs

A recurring problem in security is reverse engineering binary code to recover high-level language data abstractions and types. High-level programming languages have data abstractions such as buffers, structures, and local variables that all help programmers and program analyses reason about programs in a scalable manner. During compilation, these abstractions are removed as code is translated d...

متن کامل

An Analysis of the Deep Structure of Malekan-e- Azab and Its Relationship with the Author’s Techniques

The novel Malekan-e- Azab is the latest published work by Abu Torab-e Khosravi. The significance which is always assumed by Khosravi for words and writing is conspicuously seen in this novel as well; however, the author’s attempt to develop novel atmosphere and techniques has added a new dimension of complexity. To fully appreciate this complication and to gain an understanding of the dis...

متن کامل

Binary Obfuscation Using Signals

Reverse engineering of software is the process of recovering higher-level structure and meaning from a lowerlevel program representation. It can be used for legitimate purposes—e.g., to recover source code that has been lost—but it is often used for nefarious purposes, e.g., to search for security vulnerabilities in binaries or to steal intellectual property. This paper addresses the problem of...

متن کامل

A Survey on Tools for Binary Code Analysis

Different strategies for binary analysis are widely used in systems dealing with software maintenance and system security. Binary code is self-contained; though it is easy to execute, it is not easy to read and understand. Binary analysis tools are useful in software maintenance because the binary of software has all the information necessary to recover the source code. It is also incredibly im...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014